Internet Draft                                             Scott Shenker
Expires: April 1994                                           Xerox PARC
File: draft-shenker-realtime-model-00.txt                 David D. Clark
                                                                     MIT
                                                             Lixia Zhang
                                                              Xerox PARC

           A Service Model for an Integrated Services Internet

                            October 22, 1993
Status of Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and its Working Groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months. Internet-Drafts may be updated, replaced, or obsoleted by
other documents at any time. It is not appropriate to use Internet-
Drafts as reference material or to cite them other than as a "working
draft" or "work in progress."
Abstract
The Internet is currently being confronted with service demands from
a new generation of applications. Supporting these applications
effectively and efficiently will require extending the current
Internet "best-effort" service model to one that offers an integrated
suite of services. The purpose of this memo (which is derived
primarily from [1]) is to describe a proposed "core" service model
for an integrated services Internet. In the Appendix we discuss the
process by which such a service model could be standardized by the
IETF.
1 Introduction
The current Internet offers a very simple service model: all packets
receive the same "best effort" service. The term "best effort" means
that the network tries to forward packets as soon as possible, but
makes no quantitative commitments about the quality of service
delivered. This service model can be realized by using a single FIFO
queue to do packet scheduling in the switches; in fact, this service
model arose precisely because FIFO packet scheduling, without
admission control, cannot efficiently deliver any other service
model. This single class "best effort" service model provides the
same quality of service to all flows [Note 1]; this uniform quality
of service is good, as measured by delay and dropped packets, when
the network is lightly loaded but can be quite poor when the network
is heavily utilized. Consequently, only those applications that are
rather tolerant of this variable service, such as file transfer
(e.g., FTP), electronic mail, and interactive terminals (e.g.,
Telnet) have become widely adopted in the Internet.
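As an illustration (not part of the memo; a minimal Python sketch with
illustrative names), the entire best-effort service model can be
realized by a single drop-tail FIFO queue that keeps no per-flow state
and makes no commitments of any kind:

      from collections import deque

      class BestEffortLink:
          """Single FIFO queue: every packet gets the same best effort.

          No admission control and no per-flow state; under heavy load
          delay and loss degrade uniformly for all flows.
          """

          def __init__(self, buffer_limit=1000):
              self.queue = deque()
              self.buffer_limit = buffer_limit

          def enqueue(self, packet):
              if len(self.queue) >= self.buffer_limit:
                  return False           # drop-tail loss under overload
              self.queue.append(packet)  # arrival order is the only order
              return True

          def dequeue(self):
              return self.queue.popleft() if self.queue else None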
However, we expect there to soon be widespread demand for an emerging
generation of computer-based applications, such as FAX, remote video,
multimedia conferencing, data fusion, remote X-terminals,
visualization, and virtual reality. These applications represent a
wide variety of quality of service requirements, ranging from the
asynchronous nature of FAX and electronic mail to the extremely
time-sensitive nature of high quality audio, and from the low
bandwidth requirements of Telnet to the bandwidth intensive
requirements of HDTV. To meet all of these service requirements
using the current Internet service model, it would be necessary (but
perhaps not sufficient) to keep the utilization level extremely low.
We think a better solution is to offer a more sophisticated service
model, so that applications can specify their service needs and the
network can then allocate its resources selectively towards those
applications that are more performance sensitive. It is important to
emphasize that this solution requires that applications explicitly
specify their service desires; these needs are not derived implicitly
by the network through the inspection of port numbers.
The service model is the enduring, and therefore the most
fundamental, part of a network architecture. The service model will
be incorporated into the network service interface used by future
applications; as such, it will define the set of services they can
request, and will therefore influence the design of future
applications as well as the performance of existing ones. Thus, the
service model should not be designed in reference to any specific
network artifact but rather should be based on fundamental service
requirements. While both the underlying network technology and the
overlying suite of applications will evolve, the need for
compatibility requires that this service interface remain relatively
stable. Actually, compatibility only demands that the existing parts
of the service model remain largely unchanged; the service model
should be extensible, with augmentations handled without difficulty.
Also, we should note that these compatibility arguments apply "only"
_________________________
[Note 1] "Flow" is the term we use to refer to end-to-end connections
and other more general varieties of traffic streams. We will return to
this issue in Section 2 where we discuss multicast flows more explicit-
ly.
to those aspects of the service model which are part of the network
service interface; the service model will also have some components
(e.g., the link-sharing services as defined in Section 3) which are
exercised through a network management interface, and here the
compatibility arguments do not apply with nearly the same force.
This memo proposes a core service model for the Internet. We address
those services which relate most directly to the time-of-delivery of
packets. We do not address those services which are concerned with
which network links are used (which is the domain of routing) and
those services which involve encryption, security, authentication, or
transmission reliability. We also do not consider services, such as
reliable multicast, which do tangentially involve the time-of-
delivery but which more fundamentally involve other factors such as
buffer management and inter-switch acknowledgment algorithms.
Furthermore, we do not consider time-of-delivery services which can
best be delivered at the end host or by gateway switches at the edge
of the network, such as synchronization of different traffic streams.
Although many of the services listed above may perhaps be offered in
the future, we do not expect that they will affect the basic core of
the service model which we discuss here.
In order to efficiently support this more sophisticated service
model, Internet routers must employ an appropriately sophisticated
non-FIFO packet scheduling algorithm. In fact, the packet scheduling
algorithm is the most fundamental way in which the network can
allocate resources selectively; the network can also allocate
selectively via routing or buffer management algorithms, but neither
of these by themselves can support a sufficiently general service
model (see [10,6,7,15,9,11,13,29,18,19] for a few examples of packet
scheduling algorithms). However, packet scheduling algorithms are
only part of a complete mechanism to support explicit qualities of
service. In particular, since resources are finite, one cannot always
support an unbounded number of service requests. The network must
employ some form of admission algorithm so that it has control over
which service commitments are made. The admission process requires
that flows characterize their traffic stream to the network when
requesting service, and the network then determines whether or not to
grant the service request. It is important to keep in mind that
admission control plays a crucial role in allowing these scheduling
algorithms to be effective by keeping the aggregate traffic load down
to a level where meeting the service commitments is feasible (see
[7,20,22,24] for a few examples of admission control algorithms). In
fact, admission control is but one kind of denial of service; we will
discuss the several varieties of denial of service and their role in
allowing the scheduling algorithm to meet service commitments.
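As a minimal illustration of where admission control and denial of
service enter (a Python sketch under simple assumptions, not an
algorithm proposed in this memo), a switch might accept a new flow
only if the worst-case rates it has already committed to, plus the
new request, still fit within the link capacity:

      class SimpleAdmissionControl:
          """Accept a flow only if committed worst-case rates fit.

          Real admission algorithms (see the citations above) are far
          more sophisticated; this shows only the accept/deny decision.
          """

          def __init__(self, link_capacity_bps, utilization_target=0.9):
              self.capacity = link_capacity_bps * utilization_target
              self.committed_bps = 0.0

          def request(self, declared_peak_bps):
              if self.committed_bps + declared_peak_bps > self.capacity:
                  return False                    # denial of service
              self.committed_bps += declared_peak_bps
              return True

          def release(self, declared_peak_bps):
              self.committed_bps -= declared_peak_bps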
The rest of this memo is organized as follows. In Section 2 we identify the two kinds of
service commitments we expect future networks to make; these are
quality of service commitments to individual flows and resource-
sharing commitments to collective entities. In Section 3 we explore
the service requirements of individual flows and then propose a
corresponding set of service models. In Section 4 we discuss the
service requirements for resource-sharing commitments to collective
entities, and propose a related service model. In Section 5, we
review the various forms denial of service can take, and the ways
in which denial of service can be used to augment the core service
model. We then conclude in Section 6 by discussing other viewpoints.
In an Appendix we discuss how the Internet community can standardize
a new service model.
2 Service Commitments
A service model is made up of service commitments; that is, a service
model describes what service the network commits to deliver in
response to a particular service request. In this section, we
describe the various different kinds of service commitments that are
included in our core service model.
Service commitments can be divided up into two classes, depending on
the way in which the service is characterized. One class of service
commitment is a quantitative or absolute service commitment, which is
some form of assurance that the network service will meet or exceed
the agreed upon quantitative specifications; a typical example of
this is a bound on maximal packet delay. The other class of service
commitment is a qualitative or relative service commitment, which is
merely some form of assurance about how one set of packets will be
treated relative to other sets of packets. One example of this kind
of relative service commitment is to offer several different priority
classes; the service in any priority class is not quantitatively
characterized, but there is a relative commitment to serve traffic in
a given priority class before traffic in lower priority classes.
Thus, when we say that the current Internet offers only a single
"best-effort" class of service, this is equivalent to saying that it
does not offer any quantitative service commitments, and only offers
the most trivial relative service commitment to treat all packets
equivalently. An important distinction between these two classes of
commitments is that quantitative service commitments often inherently
require some form of admission control, with the flow characterizing
its traffic in some manner; in contrast, relative service commitments
generally do not require any admission control.
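To make the priority-class example concrete, the following sketch
(illustrative Python, not drawn from the memo) implements a purely
relative commitment: packets in a higher priority class are always
served before packets in lower classes, with no quantitative bound of
any kind:

      import heapq
      import itertools

      class PriorityScheduler:
          """Relative commitment: class k is served before classes > k.

          Within a class, service is FIFO; nothing quantitative is
          promised about the delay any class will actually see.
          """

          def __init__(self):
              self._heap = []
              self._seq = itertools.count()  # FIFO tie-break in a class

          def enqueue(self, packet, priority_class):
              heapq.heappush(self._heap,
                             (priority_class, next(self._seq), packet))

          def dequeue(self):
              return heapq.heappop(self._heap)[2] if self._heap else None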
Service commitments can also be divided into two categories depending
on the entities to which the commitments are made. The first
category of service commitments is the one most often considered in
the current literature; these are quality of service commitments to
individual flows. In this case the network provides some form of
assurance that the quality of service delivered to the contracting
flow will meet or exceed the agreed upon specifications. The need
for these kinds of service commitments is usually driven by the
ergonomic requirements of individual applications. For instance, the
perceived quality of many interactive audio and video applications
declines dramatically when the delay of incoming packets becomes too
large; thus, these applications would perform better if the network
would commit to a small bound on the maximum packet queueing delay.
In Section 3 we discuss what quality of service commitments are
included in our core service model.
In contrast, the second category of service commitment we consider
has rarely been explicitly discussed in the research literature, even
though there is widespread agreement in the industry that there is
great customer demand for this feature; these are resource-sharing
commitments to collective entities. In this case, the network
provides an assurance that the resource in question will be shared
according to some prearranged convention among some set of collective
entities. These collective entities could, for example, be
institutions, protocol families, or application types. An example of
the need for such resource-sharing commitments is when two private
companies choose to jointly purchase a fiber optic link and then
elect to share the bandwidth in proportion to the capital investments
of the two companies. In Section 4, we present a more detailed
motivation for this form of service commitment and then discuss the
particular resource-sharing commitments that are part of our core
service model.
We should reiterate that because the quality of service
commitments to individual flows will typically be invoked through the
service interface, compatibility requires that their definition
remain relatively stable. The resource sharing commitments will
typically be invoked through a network management interface, not
through the service interface used by applications, and therefore the
need for compatibility does not require such a stable service
definition.
3 Quality of Service Requirements and Service Models
In the previous section, we distinguished two sorts of service
requirements, quality of service requirements and resource sharing
requirements. In this section we consider quality of service
requirements. We first argue that packet delay is the key measure of
quality of service. We then present our assumptions about the nature
of future computer-based applications and their service requirements.
Finally, we describe a set of quality of service commitments designed
to meet these service requirements.
3.1 The Centrality of Delay
There is one measure of service that is relevant to almost all
applications: per-packet delay. In some sense, delay is the
fundamental measure of the service given to a packet, since it
describes when (and if) a packet is delivered and, if we assume
that data is never corrupted (which we think is a good
approximation for the future Internet [Note 2]), the time of
delivery is the only quantity of interest to applications. Delay
is clearly the most central quality of service, and we will
therefore start by assuming that the only qualities of service
about which the network makes commitments relate to per-packet
delay. Later, at the end of Section 3, we will return to this point
and ask if the service model that results from this initial
assumption is sufficiently general.
In addition to restricting our attention to delay, we make the
even more restrictive assumption that the only quantities about
which we make quantitative service commitments are bounds on the
maximum and minimum delays. Thus, we have excluded quantitative
service commitments about other delay related qualities of
service, such as targets for average delay. This is based on
three judgments. First, controlling nonextremal values of delay
through packet scheduling algorithms is usually impractical
because it requires detailed knowledge of the actual load, rather
than just knowledge of the best and worst case loads. Second,
even if one could control nonextremal measures of packet delay for
the aggregate traffic in the network, this does not control the
value of such measures for individual flows; e.g., the average
delay observed by a particular flow need not be the same as, or
even bounded by, the average of the aggregate (see [33] for a
discussion of related issues). Thus, controlling nonextremal
measures of delay for the aggregate is not sufficient, and we
judge it impractical to control nonextremal measures of delay for
each individual flow. Third, as will be argued in the next
section, applications that require quantitative delay bounds are
more sensitive to the extremes of delay than the averages or other
statistical measures, so even if other delay related qualities of
service were practical they would not be particularly useful. We
discuss this in the section below when we discuss real-time
applications.
Why have we not included bandwidth as a quality of service about
_________________________
[Note 2] For those links where this is not a good approximation, such as
some wireless links, we expect there to be hop-by-hop error recovery so
that at the network level there is a low error rate.
which the network makes commitments? This is primarily because,
for applications which care about the time-of-delivery of each
packet, the description of per-packet delay is sufficient. The
application determines its bandwidth needs, and these needs are
part of the traffic characterization passed to the network's
admission control algorithm; it is the application which then has
to make a commitment about the bandwidth of its traffic (when
requesting a quantitative service commitment from the network),
and the network in turn makes a commitment about delay. However,
there are some applications which are essentially indifferent to
the time-of-delivery of individual packets; for example, when
transferring a very long file the only relevant measure of
performance is the finish time of the transfer, which is almost
exclusively a function of the bandwidth. We discuss such
applications at the end of Section 3.
3.2 Application Delay Requirements
The degree to which application performance depends on low delay
service varies widely, and we can make several qualitative
distinctions between applications based on the degree of their
dependence. One class of applications needs the data in each
packet by a certain time and, if the data has not arrived by then,
the data is essentially worthless; we call these real-time
applications. Another class of applications will always wait for
data to arrive; we call these elastic applications. We now
consider the delay requirements of these two classes separately.
For the purposes of the discussion that follows, we assume that
all applications involve point-to-point communication, with all
packets requiring the same service. At the end of Section 3 we
discuss the case of multipoint-to-multipoint communication. In
Section 5 we address the case where some packets in a flow are
more important than others.
3.2.1 Real-Time Applications
An important class of such real-time applications, which are
the only real-time applications we explicitly consider in the
arguments that follow, are "playback" applications. In a
playback application, the source takes some signal, packetizes
it, and then transmits the packets over the network. The
network inevitably introduces some variation in the delay of
the delivered packets. This variation in delay has
traditionally been called "jitter". The receiver depacketizes
the data and then attempts to faithfully play back the signal.
This is done by buffering the incoming data to remove the
network induced jitter and then replaying the signal at some
fixed offset delay from the original departure time; the term
"playback point" refers to the point in time which is offset
from the original departure time by this fixed delay. Any data
that arrives before its associated playback point can be used
to reconstruct the signal; data arriving after the playback
point is essentially useless in reconstructing the real-time
signal [Note 3]. For the purposes of discussion, let us temporarily
assume that such playback applications have some intrinsic data
generation process that is unalterable; later in this section
we will return to this point.
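The playback mechanism can be summarized in a short sketch (Python,
with illustrative names; real receivers are more elaborate). Each
packet carries its generation time, and its playback point is that
time plus the fixed offset delay:

      class PlaybackBuffer:
          """Hold packets until their playback point.

          playback point = generation time + offset delay; packets
          arriving after their playback point are useless for playback.
          """

          def __init__(self, offset_delay):
              self.offset_delay = offset_delay
              self.buffer = []                 # (playback_time, data)

          def on_packet(self, generated_at, arrived_at, data):
              playback_time = generated_at + self.offset_delay
              if arrived_at > playback_time:
                  return False                 # late: cannot be played
              self.buffer.append((playback_time, data))
              return True

          def due(self, now):
              ready = [d for t, d in self.buffer if t <= now]
              self.buffer = [(t, d) for t, d in self.buffer if t > now]
              return ready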
In order to choose a reasonable value for the offset delay, an
application needs some "a priori" characterization of the
maximum delay its packets will experience. This "a priori"
characterization could either be provided by the network in a
quantitative service commitment to a delay bound, or through
the observation of the delays experienced by the previously
arrived packets; the application needs to know what delays to
expect, but this expectation need not be constant for the
entire duration of the flow.
The performance of a playback application is measured along two
dimensions: latency and fidelity. In general, latency is the
delay between the two (or more) ends of a distributed
application; for playback applications, latency is the delay
between the time the signal is generated at the source and the
time the signal is played back at the receiver, which is
exactly the offset delay. Applications vary greatly in their
sensitivity to latency. Some playback applications, in
particular those that involve interaction between the two ends
of a connection such as a phone call, are rather sensitive to
the value of the offset delay; other playback applications,
such as transmitting a movie or lecture, are not.
Fidelity is the measure of how faithful the playback signal is
to the original signal. The playback signal is incomplete when
packets arrive after their playback point and thus are dropped
rather than played back. The playback signal becomes distorted
when the offset delay is varied. Therefore, fidelity is
decreased whenever the offset delay is varied and whenever
packets miss their playback point. Applications exhibit a wide
range of sensitivity to loss of fidelity. We will consider two
somewhat artificially dichotomous classes: intolerant
_________________________
[Note 3] It is an oversimplification to say that the data is useless; we
discuss below that a receiving application could adjust the playback
point as an alternative to discarding late packets.
applications, which require an absolutely faithful playback,
and tolerant applications, which can tolerate some loss of
fidelity [Note 4]. Intolerance to loss of fidelity might arise
because of user requirements (e.g., distributed symphony
rehearsal), or because the application hardware or software is
unable to cope with missing pieces of data. On the other hand,
users of tolerant applications, as well as the application
hardware and software, are prepared to accept occasional
distortions in the signal. We expect that the vast bulk of
audio and video applications will be tolerant.
Delay can affect the performance of playback applications in
two ways. First, the value of the offset delay, which is
determined by predictions about the future packet delays,
determines the latency of the application. Second, the delays
of individual packets can decrease the fidelity of the playback
by exceeding the offset delay; the application then can either
change the offset delay in order to play back late packets
(which introduces distortion) or merely discard late packets
(which creates an incomplete signal). The two different ways
of coping with late packets offer a choice between an
incomplete signal and a distorted one, and the optimal choice
will depend on the details of the application, but the
important point is that late packets necessarily decrease
fidelity.
Intolerant applications must use a fixed offset delay, since
any variation in the offset delay will introduce some
distortion in the playback. For a given distribution of packet
delays, this fixed offset delay must be larger than the
absolute maximum delay, to avoid the possibility of late
packets. In contrast, tolerant applications need not set their
offset delay greater than the absolute maximum delay, since
they can tolerate some late packets. Moreover, tolerant
applications can vary the offset delay to some extent, as long
as it doesn't create too much distortion.
Thus, tolerant applications have a much greater degree of
flexibility in how they set and adjust their offset delay. In
particular, instead of using a single fixed value for the
offset delay, they can attempt to reduce their latency by
varying their offset delays in response to the actual packet
_________________________
[Note 4] Obviously, applications lie on a continuum in their sensitivity
to fidelity. Here we are merely considering two cases as a pedagogical
device to motivate our service model, which indeed applies to the full
spectrum of applications.
delays experienced in the recent past. We call applications
which vary their offset delays in this manner "adaptive"
playback applications (a more precise term is "delay-adaptive"
playback applications, to distinguish it from the "rate-
adaptive" playback applications we discuss below). This
adaptation amounts to gambling that the past packet delays are
good predictors of future packet delays; when the application
loses the gamble there is a momentary loss of data as packets
miss their playback points, but since the application is
tolerant of such losses the decreased offset delay may be worth
it. Besides the issue of inducing late packets, there is a
complicated tradeoff between the advantage of decreased offset
delay and the disadvantage of reduced fidelity due to
variations in the offset. Thus, how aggressively an
application adapts, or even if it adapts at all, depends on the
relative ergonomic impact of fidelity and latency. Our main
observation here, though, is that by adapting to the delays of
incoming packets, tolerant playback applications can often
profit by reducing their offset delay when the typical delays
are well below the absolute maximum; this advantage, of course,
is accompanied by the risk of occasional late packets.
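One plausible way to adapt the offset delay (an illustrative Python
sketch; the memo does not prescribe any particular estimator) is to
track an exponentially weighted mean and deviation of recently
observed delays and set the offset a few deviations above the mean:

      class AdaptiveOffsetDelay:
          """Delay-adaptive estimate of the playback offset.

          The gain and the safety multiplier are illustrative knobs; a
          larger safety factor trades latency for fewer late packets.
          """

          def __init__(self, gain=0.125, safety=4.0):
              self.gain = gain
              self.safety = safety
              self.mean = 0.0
              self.dev = 0.0

          def observe(self, delay):
              err = delay - self.mean
              self.mean += self.gain * err
              self.dev += self.gain * (abs(err) - self.dev)

          def offset_delay(self):
              # Gamble that recent delays predict future ones; packets
              # delayed beyond this offset miss their playback point.
              return self.mean + self.safety * self.dev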
So far we have assumed that an application's data generation
process is unalterable. However, there are likely to be many
audio and video applications which can adjust their coding
scheme and thus can alter the resulting data generation
process. This alteration of the coding scheme will present a
tradeoff between fidelity (of the coding scheme itself, not of
the playback process) and the bandwidth requirements of the
flow. Such "rate-adaptive" playback applications have the
advantage that they can adjust to the current network
conditions not just by resetting their playback point but also
by adjusting the traffic pattern itself.
We now state several of our assumptions about the nature of
future real-time applications. First, we believe that most
audio and video applications will be playback applications, and
we therefore think that playback applications will be the
dominant category of real-time traffic. By designing a service
model that is appropriate for these playback applications, we
think we will have satisfactorily (but perhaps not optimally)
met the needs of all real-time applications. Second, we
believe that the vast majority of playback applications will be
tolerant and that many, if not most, of these tolerant playback
applications will be adaptive. The idea of adaptive
applications is not relevant to circuit-switched networks,
which do not have jitter due to queueing. Thus, most real-time
devices today, like voice and video codecs, are not adaptive.
Lack of widespread experience may raise the concern that
adaptive applications will be difficult to build. However,
early experiments suggest that it is actually rather easy.
Video can be made to adapt by dropping or replaying a frame as
necessary, and voice can adapt imperceptibly by adjusting
silent periods. In fact, such adaptive approaches have been
employed in packetized voice applications since the early 70's
(see [34,36]); the VT [37], NEVOT [38], and VAT [] packet voice
protocols, which are currently used to transmit voice on the
Internet, are living examples of such adaptive applications.
Third, we believe that most playback applications will have
sufficient buffering to store packets until their playback
point. We base our belief on the fact that the storage needed
is a function of the queueing delays, not the total end-to-end
delay. There is no reason to expect that queueing delays for
playback applications will increase as networks get faster (in
fact, for an M/M/1 queueing system with a fixed utilization,
queueing delays are inversely proportional to the speed), and
it is certainly true that memory is getting cheaper, so
providing sufficient buffering will become increasingly
practical. Fourth, and last, we assume that applications have
sufficient knowledge about time to set the playback point. The
notion of a playback application implies that such applications
have some knowledge about the original generation time of the
data. This knowledge could either be explicitly contained in
timestamps, or an approximation could be implicitly obtained by
knowing the inter-packet generation intervals of the source.
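The M/M/1 claim can be checked directly: the mean queueing delay is
W_q = rho / (mu * (1 - rho)), so at fixed utilization rho it falls as
1/mu (a quick Python sanity check, not text from the memo):

      def mm1_mean_queueing_delay(service_rate, utilization):
          """Mean M/M/1 waiting time W_q = rho / (mu * (1 - rho))."""
          assert 0 < utilization < 1
          return utilization / (service_rate * (1.0 - utilization))

      # At fixed utilization, a 10x faster link queues packets for
      # one-tenth as long:
      slow = mm1_mean_queueing_delay(service_rate=1e3, utilization=0.8)
      fast = mm1_mean_queueing_delay(service_rate=1e4, utilization=0.8)
      assert abs(slow / fast - 10.0) < 1e-9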
3.2.2 Elastic Applications
While real-time applications do not wait for late data to
arrive, elastic applications will always wait for data to
arrive. It is not that these applications are insensitive to
delay; to the contrary, significantly increasing the delay of a
packet will often harm the application's performance. Rather,
the key point is that the application typically uses the
arriving data immediately, rather than buffering it for some
later time, and will always choose to wait for the incoming
data rather than proceed without it. Because arriving data can
be used immediately, these applications do not require any a
priori characterization of the service in order for the
application to function. Generally speaking, it is likely that
for a given distribution of packet delays, the perceived
performance of elastic applications will tend to depend more on
the average delay than on the tail of the distribution. One
can think of several categories of such elastic applications:
interactive burst (Telnet, X, NFS), interactive bulk transfer
(FTP), and asynchronous bulk transfer (electronic mail, FAX).
The delay requirements of these elastic applications vary from
rather demanding for interactive burst applications to rather
lax for asynchronous bulk transfer, with interactive bulk
transfer being intermediate between them.
Some elastic applications, like Telnet, have an intrinsic data
generation process which is largely independent of network
conditions. However, there are many elastic applications, in
particular those involving bulk transfer, which can alter their
packet transmission process.
3.3 Delay Service Models
We now turn to describing service models that are appropriate for
the various classes of applications that were discussed in the
previous paragraphs. Since we are assuming that playback
applications comprise the bulk of the real-time traffic, we must
design service models for intolerant playback applications,
tolerant playback applications (which can be either adaptive or
non-adaptive), rate-adaptive playback applications (which can be
either tolerant or intolerant), and elastic applications.
The offset delay of intolerant playback applications must be no
smaller than the maximum packet delay to achieve the desired
faithful playback. Furthermore, this offset delay must be set
before any packet delays can be observed. Such an application can
only set its offset delay appropriately if it is given a perfectly
reliable [Note 5] upper bound on the maximum delay of each packet.
We call a service characterized by a perfectly reliable upper
bound on delay "guaranteed service", and propose this as the
appropriate service model for intolerant playback applications.
Note that the delay bound not only allows the application to set
its offset delay appropriately, but it also provides the
information necessary to predict the resulting latency of the
application.
Since such an intolerant playback application will queue all
packets until their respective playback points, application
performance is completely independent of when the packets arrive,
as long as they arrive within the delay bound. The fact that we
assume that there is sufficient buffering means that we need not
_________________________
[Note 5] By perfectly reliable, we mean that the bound is based on worst
case assumptions about the behavior of all other flows. The validity of
the bound is predicated on the proper functioning of all network
hardware and software along the path of the flow.
provide a nontrivial lower bound to delay; only the trivial "no-
queueing" minimum delay will be given as part of the service
specification.
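To illustrate how such a perfectly reliable bound can arise (an
example in the spirit of the fluid-model scheduling literature cited
earlier, not a mechanism defined by this memo), a flow that never
exceeds a token bucket (r, b) and is given a dedicated service rate
R >= r can never have more than b bits queued, so its queueing delay
is at most b/R:

      def guaranteed_delay_bound(bucket_depth_bits, reserved_rate_bps):
          """Worst-case fluid queueing delay for an (r, b) token-bucket
          flow served at a dedicated rate R >= r: at most b / R seconds.
          """
          return bucket_depth_bits / reserved_rate_bps

      # Example: a 64 kbit burst allowance with 1 Mbit/s reserved gives
      # a worst-case queueing delay of 64 ms.
      assert guaranteed_delay_bound(64_000, 1_000_000) == 0.064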
A tolerant playback application which is not adaptive will also
need some form of a delay bound so that it can set its offset
delay appropriately. Since the application is tolerant of
occasional late packets, this bound need not be perfectly
reliable. For this class of applications we propose a service
model called "predictive service" which supplies a fairly
reliable, but not perfectly reliable, delay bound. For this
service, the network advertises a bound which it has reason to
believe with great confidence will be valid, but cannot formally
"prove" its validity [Note 6] violated, the application's
performance will perhaps suffer, but the users are willing to
tolerate such interruptions in service in return for the presumed
lower cost of the service and lower realized delays [Note 7]
It is important to emphasize that this is "not" a statistical
bound, in that no statistical failure rate is provided to the
application in the service description. We do not think it
feasible to provide a statistical characterization of the delay
distribution because that would require a detailed statistical
characterization of the load. We do envision the network ensuring
the reliability of these predictive bounds, but only over very
long time scales; for instance, the network could promise that no
more than a certain fraction of packets would violate the
predictive bounds over the course of a month [Note 8]. This is not a
_________________________
[Note 6] This bound, in contrast to the bound in the guaranteed service,
is not based on worst case assumptions on the behavior of other flows.
Instead, this bound might be computed with properly conservative predic-
tions about the behavior of other flows.
[Note 7] For nonadaptive applications, the realized latency is lower
with predictive service since the fairly reliable bounds will be less
conservative than the perfectly reliable bounds of guaranteed service.
For adaptive applications, as we discuss below, the minimax component of
predictive service can, and we expect usually will, reduce the average
latency, i.e. the average value of the offset delay, to be well below
the advertised bound.
[Note 8] Such an assurance is not meaningful to an individual flow,
whose service over a short time interval might be significantly worse
than the nominal failure rate. We envision that such assurances would
be directed at the regulatory bodies which will supervise the adminis-
tration of such networks. However, we should note that there may very
well be pricing schemes which refund money if the service delivered to
an individual application doesn't meet some standard (such as a given
fraction of packets obey the delay bound); this is not a service
commitment but rather a monetary one.
prediction of performance but rather a commitment to adjust its
bound-setting algorithm to be sufficiently conservative.
All nonadaptive applications, whether tolerant or not, need an "a
priori" delay bound in order to set their offset delay; the degree
of tolerance only determines how reliable this bound must be. In
addition to being necessary to set the offset delay, these delay
bounds provide useful estimates of the resulting latency.
Nonadaptive tolerant applications, like the intolerant
applications considered above, are indifferent to when their
packets arrive, as long as they arrive before the delay bound.
Recall, however, that we are assuming that many, if not most,
tolerant playback applications are adaptive. Thus, we must design
the service model with such adaptation in mind. Since these
applications will be adapting to the actual packet delays, a delay
bound is not needed to set the offset delay. However, in order to
choose the appropriate level of service, applications need some
way of estimating their performance with a given level of service.
Ideally, such an estimate would depend on the detailed packet
delay distribution. We consider it impractical to provide
predictions or bounds on anything other than the extremal delay
values. Thus, we propose offering the same predictive service to
tolerant adaptive applications, except that here the delay bound
is not primarily used to set the offset delay (although it may be
used as a hint) but rather is used to predict the likely latency
of the application.
The actual performance of adaptive applications will depend on the
tail of the delay distribution. We can augment the predictive
service model to also give "minimax" service, which is to attempt
to minimize the ex post maximum delay. This service is not trying
to minimize the delay of every packet, but rather is trying to
pull in the tail of the distribution. Here the fairly reliable
predictive delay bound is the quantitative part of the service
commitment, while the minimax part of the service commitment is a
relative service commitment. We could offer separate service
models for adaptive and nonadaptive tolerant playback
applications, with both receiving the predictive service as a
quantitative service commitment and with only adaptive
applications receiving the minimax relative commitment. However,
since the difference in the service models is rather minor, we
choose to only offer the combination of predictive and minimax
service.
It is clear that given a choice, with all other things being
equal, an application would perform no worse with absolutely
reliable bounds than with fairly reliable bounds. Why, then, do
we offer predictive service? The key consideration here is
efficiency [Note 9]; when one relaxes the service requirements
from perfectly to fairly reliable bounds, this increases the level
of network utilization that can be sustained, and thus the price
of the predictive service will presumably be lower than that of
guaranteed service. The predictive service class is motivated by
the conjecture that the performance penalty will be small for
tolerant applications but the overall efficiency gain will be
quite large.
As we discussed above, both of these service models have a
quantitative component. In order to offer this service, the nature
of the traffic from the source must be characterized, and there
must be some admission control algorithm which insures that a
requested flow can actually be accommodated. This
characterization will not be a detailed statistical
characterization (we do not believe applications will be able to
provide those) but instead will be a worst-case characterization;
the flow will commit to not exceed some usage envelope or filter
(e.g., a token bucket or leaky bucket). A fundamental point of
our overall architecture is that traffic characterization and
admission control are necessary for these real-time delay bound
services. For rate-adaptive applications, these traffic
characterizations are not immutable. We can thus augment the
service model by allowing the network to notify (either implicitly
through packet drops or explicitly through control packets) rate-
adaptive applications to change their traffic characterization.
We will discuss this more thoroughly in Section 5.
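A token bucket of the kind mentioned above can be sketched as follows
(a standard construction, shown in Python with illustrative names): a
flow conforms to the envelope (r, b) as long as it never sends more
than r*t + b bits in any interval of length t:

      class TokenBucket:
          """Usage envelope (r, b): at most r*t + b bits in any
          interval of length t.

          Tokens accrue at rate r up to depth b; a packet conforms
          only if enough tokens are available when it is sent.
          """

          def __init__(self, rate_bps, depth_bits):
              self.rate = rate_bps
              self.depth = depth_bits
              self.tokens = depth_bits
              self.last = 0.0

          def conforms(self, now, packet_bits):
              elapsed = now - self.last
              self.tokens = min(self.depth,
                                self.tokens + self.rate * elapsed)
              self.last = now
              if packet_bits <= self.tokens:
                  self.tokens -= packet_bits
                  return True
              return False   # flow exceeded its declared envelope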
The fourth category for which we must develop a service model is
elastic applications. Elastic applications are rather different
than playback applications; while playback applications hold
packets until their playback time, elastic applications use the
packet whenever it arrives. Thus, reducing the delays of any
packet tends to improve performance. Furthermore, since there is
no offset delay, there is no need for an "a priori"
characterization of the delays. An appropriate service model is
to provide "as-soon-as-possible", or ASAP service, which is a
relative, not quantitative, commitment [Note 10]. Elastic
_________________________
[Note 9] Efficiency can be thought of as the number of applications that
can be simultaneously serviced with a given amount of bandwidth; for a
fuller definition, see [,].
[Note 10] We choose not to use the term "best-effort" for the ASAP ser-
vice since that connotes the FIFO service discipline.
applications vary greatly in their sensitivity to delay (which, as
we mentioned earlier, is probably more a function of the average
delay than of the maximum delay), and so the service model for
elastic traffic should distinguish between the various levels of
delay sensitivity. We therefore propose a multiclass ASAP service
model to reflect the relative delay sensitivities of different
elastic applications. This service model allows interactive burst
applications to have lower delays than interactive bulk
applications, which in turn would have lower delays than
asynchronous bulk applications. In contrast to the real-time
service models, this service model does not provide any
quantitative service commitment, and thus applications cannot
predict their likely performance and are also not subject to
admission control. However, we think that rough predictions about
performance, which are needed to select a service class, could be
based on the ambient network conditions and historical experience.
If the network load is unusually high, the delays will degrade and
the users must be prepared to tolerate this, since there was no
admission control to limit the total usage.
However, there may be some cases where an application (or the user
of the application) might want to know more precisely the
performance of the application in advance. For instance, a Telnet
user might want to ensure that the delays won't interfere with her
typing. For these cases, the application can request predictive
service (since the firmness of the guaranteed bound is probably
not required) provided it is willing to specify the maximum
transmission rate desired. Note that since the network will then
require compliance with the advertised transmission rate, the
application cannot get a higher throughput rate than what it
requested.
There are two issues regarding the elastic service model [Note 11]
that we do not address in this memo, and propose that these issues
be revisited once the rest of the core service model is defined.
First, there is the issue of relative treatment of flows. One
could treat each elastic packet independently, and allocate
service based merely on time-of-arrival and the level of ASAP
service requested. Alternatively, one could also attempt to
control the aggregate resources used by each individual flow, such
as is done in the Fair Queueing service model as described in [].
We do not address the relative treatment of various flows at this
time, since it will not affect the basic service interface.
_________________________
[Note 11] We have used the convenient, but perhaps confusing convention,
of referring to elastic service and real-time service when in fact the
terms real-time and elastic refer to a class of applications.
Second, there is the issue of feedback. As we noted before, some
elastic applications can adjust their transmission pattern. This
adjustment can be in response to implicit signals, such as packet
drops or delay, or explicit signals such as congestion bits in the
packet header or separate control packets. Again, we do not
address at this time the form or content of such feedback signals
since they do not affect the basic service interface.
At the beginning of this section, we made the initial assumption
that delay was the only quality of service about which the network
needed to make commitments. We now revisit this issue and ask if
that is indeed the case. For the typical real-time or elastic
application which cares about the delays of individual packets,
there seems to be no need to include any other quality of service.
However, we observed earlier that there are some applications,
such as transfers of very long files, which are essentially
indifferent to the delays of individual packets and are only
concerned with overall delay of the transfer. For these
"indifferent" applications, bandwidth rather than delay is a more
natural characterization of the desired service, since bandwidth
dictates the application performance. If such an application has
no intrinsic overall delay requirement, then the desired service
is to finish the transfer as quickly as possible. The desired
service is "as-much-bandwidth-as-possible". By servicing packets
as soon as possible, the ASAP service described above delivers
exactly this as-much-bandwidth-as-possible service. Thus, while
we did not explicitly consider bulk transfer applications, our
proposed service model already provides the desired service for
bulk transfer applications with no intrinsic overall delay
requirements.
However, if this bulk transfer application had some intrinsic
overall delay requirement, i.e. it required the transfer to be
completed within a certain time, then the ASAP service is no
longer sufficient. Now, the appropriate service is to allow the
application to request a specified amount of bandwidth; the
application chooses this bandwidth amount so that the transfer
will be completed in time. An application can secure a given
amount of bandwidth through either of the real-time services. The
per-packet delay bounds provided by these real-time services are
superfluous to bulk transfer applications with overall delay
requirements. While one could imagine a different service which
provided a commitment on bandwidth but not per-packet delay, the
difference between requesting a large delay bound and no delay
bound is rather insignificant, and thus we expect that such
indifferent applications with delay requirements will be
adequately served by predictive service with very large delay
bounds. This has the disadvantage that indifferent applications
with delay requirements do not get as-much-bandwidth-as-possible,
but are constrained to their reserved amount.
                          Applications
                         /            \
                        /              \
                       /                \
                 Elastic                 Real-Time
                /   |   \                 /     \
               /    |    \               /       \
              /     |     \             /         \
   Interactive Interactive Asynchronous   Tolerant   Intolerant
      Burst       Bulk        Bulk           |            |
        |           |           |            |            |
        V           V           V            V            V
    _________   _________   _________    __________   __________
   |  ASAP   | |  ASAP   | |  ASAP   |  |Predictive| |Guaranteed|
   | Level 1 | | Level 2 | | Level 3 |  | Minimax  | |          |
   |_________| |_________| |_________|  |__________| |__________|

   Figure 1: Our rough taxonomy of applications and their associated
   service models.  We have arbitrarily depicted three levels of ASAP
   service.
Figure 1 depicts our taxonomy of applications and the associated
service models. This taxonomy is neither exact nor complete, but
was only used to guide the development of the core service model.
The resulting core service model should be judged not on the
validity of the underlying taxonomy but rather on its ability to
adequately meet the needs of the entire spectrum of applications.
In particular, not all real-time applications are playback
applications; for example, one might imagine a visualization
application which merely displayed the image encoded in each
packet whenever it arrived. However, non-playback applications
can still use either the guaranteed or predictive real-time
service model, although these services are not specifically
tailored to their needs. Similarly, playback applications cannot
be neatly classified as either tolerant or intolerant, but rather
fall along a continuum; offering both guaranteed and predictive
service allows applications to make their own tradeoff between
cost, fidelity, and latency. Despite these obvious deficiencies
in the taxonomy, we expect that it describes the service
requirements of current and future applications well enough so
that our core service model can adequately meet all application
needs.
We have defined the core service model in terms of point-to-point
flows. We can easily generalize this service model to incorporate
multipoint-to-multipoint flows. Clearly for elastic traffic there
is no change to the service model. For the real-time service
models, the delay bounds are specified for each point-to-point
pair, and the traffic characterizations apply to the sum of the
flow's traffic at each hop along the flow's path.
4 Resource-Sharing Requirements and Service Models
The last section considered quality of service commitments; these
commitments dictate how the network must allocate its resources among
the individual flows. This allocation of resources is typically
negotiated on a flow-by-flow basis as each flow requests admission to
the network, and does not address any of the policy issues that arise
when one looks at collections of flows. To address these collective
policy issues, we now discuss resource-sharing service commitments.
Recall that for individual quality of service commitments we focused
on delay as the only quantity of interest. Here, we postulate that
the quantity of primary interest in resource-sharing is aggregate
bandwidth on individual links. Our reasoning for this is as follows.
Meeting individual application service needs is the task of quality
of service commitments; however, both the number of quantitative
service commitments that can be simultaneously made, and the
quantitative performance delivered by the relative service
commitments, depend on the aggregate bandwidth. Thus, when
considering collective entities we claim that we need only control
the aggregate bandwidth available to the constituent applications; we
can deal with all other performance issues through quality of service
commitments to individual flows. Embedded within this reasoning is
the assumption that bandwidth is the only scarce commodity; if
buffering in the switches is scarce then we must deal with buffer-
sharing explicitly, but we contend that switches should be built with
enough buffering so that buffer contention is not the primary
bottleneck.
Thus, this component of the service model, called "link-sharing",
addresses the question of how to share the aggregate bandwidth of a
link among various collective entities according to some set of
specified shares. There are several examples that are commonly used
to explain the requirement of link-sharing among collective entities.
Multi-entity link-sharing -- A link may be purchased and used
jointly by several organizations, government agencies or the like.
They may wish to insure that under overload the link is shared in a
controlled way, perhaps in proportion to the capital investment of
each entity. At the same time, they might wish that when the link is
underloaded, any one of the entities could utilize all the idle
bandwidth.
Multi-protocol link-sharing -- In a multi-protocol Internet, it may
be desired to prevent one protocol family (DECnet, IP, IPX, OSI, SNA,
etc.) from overloading the link and excluding the other families.
This is important because different families may have different
methods of detecting and responding to congestion, and some methods
may be more "aggressive" than others. This could lead to a situation
in which one protocol backs off more rapidly than another under
congestion, and ends up getting no bandwidth. Explicit control in the
router may be required to correct this. Again, one might expect that
this control should apply only under overload, while permitting an
idle link to be used in any proportion.
Multi-service sharing -- Within a protocol family such as IP, an
administrator might wish to limit the fraction of bandwidth allocated
to various service classes. For example, an administrator might wish
to limit the amount of real-time traffic to some fraction of the
link, to avoid preempting elastic traffic such as FTP.
In general terms, the link-sharing service model is to share the
aggregate bandwidth according to some specified shares; however, one
must be careful to state exactly what this means. The following
example will highlight some of the policy issues implicit in link-
sharing. Consider three firms, 1, 2, and 3, who respectively have
shares 1/4, 1/4, and 1/2 of some link. Assume that for a certain
hour, firm 1 sends no traffic to the link while firms 2 and 3 each
send enough to use the entire capacity of the link. Are firms 2 and
3 restricted to only using their original shares of the link, or can
they use firm 1's unused bandwidth? Assume for now that they are
allowed to use firm 1's unused bandwidth. Then, how is firm 1's
share of the link split between firms 2 and 3? If, in the next
twenty minutes, all three firms each send enough traffic to consume
the entire link, is the link allocated solely to firm 1 in order to
make up for the imbalance in aggregate bandwidth incurred during the
first hour, or is the link shared according to the original shares?
Thus, there are three policy questions to be resolved: can firms use
each other's unused bandwidth, how is this unused bandwidth allocated
to the remaining firms, and over what time scale is the sharing of
bandwidth measured? Clearly the answer to the first question must be
affirmative, since much of the original motivation for link-sharing
is to take advantage of the economies of statistical aggregation. As
for the second question, one can imagine many rules for splitting up
the excess bandwidth but here we propose that the excess is assigned
in proportion to the original shares so that in the above example
during the first hour the link would be split 1/3, 2/3 for firms 2
and 3 respectively. The answer to the third question is less clear.
The preceding example indicates that if sharing is measured over some
time scale T then a firm's traffic can be halted for a time on the
order of T under certain conditions; since such cessation should be
avoided, we propose doing the sharing on an instantaneous basis
(i.e., the limit of T going to zero). This would dictate that during
this next twenty minutes the bandwidth is split exactly according to
the original shares: 1/4, 1/4, and 1/2. This policy embodies a
"use-it-or-lose-it" philosophy in that the firms are not given credit
at a later date for currently unused bandwidth.
An idealized fluid model of instantaneous link-sharing with
proportional sharing of excess is the fluid processor sharing model
(introduced in [] and further explored in [29,18]) where at every
instant the available bandwidth is shared between the active entities
(i.e., those having packets in the queue) in proportion to the
assigned shares of the resource. More specifically, we let mu be the
speed of the link and we give each entity i its own virtual queue
which stores its packets as they await service. For each entity i we
define the following quantities: s_i, the share of the link; c_i(t),
the cumulative number of bits in the traffic stream that have arrived
by time t; and the backlog b_i(t), the number of bits remaining in
the virtual queue at time t. Whenever a real packet arrives at the
switch belonging to entity i, we place a corresponding idealized
packet at the tail of that entity's virtual queue. The service
within each such virtual queue is FIFO. We now describe how service
is allocated among the different virtual queues. The idealized
service model is defined by the equations:
b_i'(t) = c_i'(t) - min( s_i*L, c_i'(t) )    if b_i(t) = 0     (1)
and
b_i'(t) = c_i'(t) - s_i*L                    if b_i(t) > 0     (2)
where prime represents differentiation with respect to time and L
is the unique constant that makes
( Sum over i of b_i'(t) ) = ( Sum over i of c_i'(t) ) - mu;
when no such value exists, we set L to infinity.
At every instant the excess bandwidth, that is, the bandwidth left
over from entities not using their entire share, is split among the
active entities (i.e., those with b_i > 0) in proportion to their
shares; each active [Note 12] entity receives an instantaneous
bandwidth that is greater than or equal to its share of the full
transmission rate.
_________________________
[Note 12] There are three states a flow can be in: active (b_i>0),
inactive (b_i=0 and c_i'=0), and in-limbo (b_i=0 but c_i'>0).
This fluid model exhibits the desired policy behavior but is, of
course, an unrealistic idealization. We therefore propose that the
actual service model approximate, as closely as possible, the
bandwidth shares produced by this ideal fluid model. It is not
necessary to require that the specific order of packet departures
match that of the fluid model: we presume that all detailed
per-packet delay requirements of individual flows are addressed
through quality of service commitments and, furthermore, that
satisfaction with the link-sharing service delivered will probably
not depend very sensitively on small deviations from the scheduling
implied by the fluid link-sharing model. The link-sharing service
model thus provides quantitative service commitments on the bandwidth
shares that the various entities receive.
Heretofore we have considered link-sharing across a set of entities
with no internal structure to the entities themselves. However, the
various sorts of link-sharing requirements presented above could
conceivably be nested into a hierarchy of link-sharing requirements,
an idea first proposed by Jacobson and Floyd [11]. For instance, a
link could be divided between a number of organizations, each of
which would divide the resulting allocation among a number of
protocols, each of which would be divided among a number of services.
We propose extending the idealized link-sharing service model
presented above to the hierarchical case. The policy desires will be
represented by a tree with shares assigned to each node; the shares
belonging to the children of each node must sum to the share of the
node, and the top node represents the full link and has a unit share.
Furthermore, each node has an arrival stream described by c_i(t) and
a backlog b_i(t), with the quantities of the children of each node
summing to the quantity of the node. Then, at each node we invoke
the fluid processor sharing model among the children, with the
instantaneous link speed at the i'th node, mu_i(t), set equal to the
rate c_i'(t) - b_i'(t) at which bits are draining out of that node's
virtual queue. We can start this model at the top node; when
propagated down to the leaf nodes, or bottom-level entities, this
determines the idealized service model.
The introduction of a hierarchy raises further policy questions,
which are illustrated by the following example, depicted in Figure 2.
Let us assume that each of the bottom-level entities, 1a, 1b, 2a, and
2b, has a 1/4 share of the link. When all of the bottom-level
entities are sending enough to consume their share, the bandwidth is
split exactly according to these shares. Now assume that at some
instant there is no offered 2b traffic. Should each of 1a, 1b, and
2a get 1/3 of the link, or should 1a and 1b continue to get 1/4, with
2a getting the remaining 1/2 share of the link, which is the total of
the shares belonging to firm 2? This is a policy question to be
determined by the firms, so the service model should allow either.
Figure 2 depicts two possible sharing trees: tree #1 produces the
1/4, 1/4, 1/2 sharing, whereas tree #2 produces the 1/3, 1/3, 1/3
sharing. When the link-sharing service commitment is
negotiated, it will be specified by a tree and an assignment of
shares for the nodes.
                Link                        Link
               /    \                    /  /  \  \
              1      2                  /  /    \  \
             / \    / \                /   |    |   \
           1a   1b 2a   2b           1a    1b  2a    2b

              Tree #1                     Tree #2
Figure 2: Two possible sharing trees with equal shares at
all leaf nodes. When one of the leaf nodes is not active,
the trees produce different bandwidth shares for the
remaining active nodes.
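The consequences of the two trees can be computed with a few lines of
code. The sketch below (ours, purely for illustration) simplifies
the fluid model by assuming that every active leaf is continuously
backlogged, so that only the shares and the set of active leaves
matter; it renormalizes shares among the active children of each
node, from the top down.

    def has_active(node, active):
        # A subtree is active if any of its leaf entities has
        # traffic; leaves are strings, interior nodes are lists of
        # (share, subtree) pairs.
        if isinstance(node, str):
            return node in active
        return any(has_active(t, active) for _, t in node)

    def leaf_rates(node, capacity, active, out):
        # Split `capacity` among the active children in proportion
        # to their shares, recursing down to the leaf entities.
        if isinstance(node, str):
            out[node] = capacity
            return
        live = [(s, t) for s, t in node if has_active(t, active)]
        total = sum(s for s, _ in live)
        for s, t in live:
            leaf_rates(t, capacity * s / total, active, out)

    # Tree #1: the link is split between the two firms, and each
    # firm's allocation is split between its two protocols.
    tree1 = [(0.5, [(0.5, "1a"), (0.5, "1b")]),
             (0.5, [(0.5, "2a"), (0.5, "2b")])]
    # Tree #2: the four bottom-level entities share the link flat.
    tree2 = [(0.25, "1a"), (0.25, "1b"), (0.25, "2a"), (0.25, "2b")]

    for tree in (tree1, tree2):
        out = {}
        leaf_rates(tree, 1.0, {"1a", "1b", "2a"}, out)  # 2b is idle
        print(out)
    # tree #1 -> 1a: 0.25, 1b: 0.25, 2a: 0.50
    # tree #2 -> 1a: 0.33, 1b: 0.33, 2a: 0.33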
In the hierarchical model, the bandwidth sharing between the children
of a given node was independent of the structure of the
grandchildren. One can think of far more general link-sharing
service models. Assume that, in the example above, protocol `a'
carries traffic from applications with tight delay requirements and
protocol `b' carries traffic from applications with loose delay
requirements. The two firms might then want a sharing policy in
which, when 1a is not fully using its share of the link, the excess
is shared equally between 1b and 2a, but when 1b is not fully using
its share, the excess goes exclusively to 1a. To implement this more
complicated policy, it is necessary to
take the grandchildren structure into account. We think that this
sort of flexibility is probably not needed, for the same reason that
we restricted ourselves to bandwidth as the only collective concern;
quality of service issues should be addressed via quality of service
commitments and not through the link-sharing service model. For this
same reason, we do not make priority distinctions between the various
nodes, but merely allocate shares of bandwidth. Therefore, for our
resource-sharing service model we restrict ourselves to the
hierarchical service model presented above.
In Section 3.1 we observed that admission control was necessary to
ensure that the real-time service commitments could be met.
Similarly, admission control will again be necessary to ensure that
the link-sharing commitments can be met. For each bottom-level
entity, admission control must keep the cumulative guaranteed and
predictive traffic from exceeding the assigned link-share.
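A minimal sketch of this bookkeeping (the class and field names are
ours, purely for illustration):

    class EntityAdmission:
        # Tracks the guaranteed plus predictive bandwidth reserved
        # under one bottom-level entity's link-share.
        def __init__(self, share, mu):
            self.limit = share * mu   # the entity's assigned bandwidth
            self.reserved = 0.0

        def admit(self, rate):
            # Refuse a reservation that would overrun the share.
            if self.reserved + rate > self.limit:
                return False
            self.reserved += rate
            return True

        def release(self, rate):
            self.reserved -= rate

    firm = EntityAdmission(share=0.25, mu=45e6)  # 1/4 of a 45 Mbps link
    print(firm.admit(8e6))   # True:  8 Mbps fits under 11.25 Mbps
    print(firm.admit(4e6))   # False: 12 Mbps would exceed the share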
5 Denial of Service
To meet its quantitative service commitments, the network must employ
some form of admission control. Without the ability to deny flows
admission to the network, one could not reliably provide the various
delay bound services offered by our service model. In fact,
admission control is just one aspect of denial of service; there are
several other ways in which service can be denied. Denial of
service, in all of its incarnations, plays a fundamental role in
meeting quantitative service commitments. In particular, denial of
service can be used to augment the resource sharing portion of the
core service model by supporting utilization targets. Moreover,
denial of service, through the use of the preemptable and expendable
service options discussed below, can enable the network to meet its
service commitments while still maintaining reasonably high levels of
network utilization.
Denial of service, like service commitments, can occur at various
levels of granularity. Specifically, denial of service can apply to
whole flows, or to individual packets within a flow. We discuss
these two cases separately.
5.1 Denial to Flows
Denial of service to a flow can occur either before or during the
lifetime of that flow. Denying service to a flow before it enters
the network is typically referred to as admission control. As we
envision it, in order to receive either of the two real-time
bounded delay services (guaranteed and predictive), a flow will
have to explicitly request that service from the network, and this
request must be accompanied by a worst-case characterization of
the flow's traffic stream. This characterization gives the
network the information necessary to determine if it can indeed
commit to providing the requested delay bounds. The request is
denied if the network determines that it cannot reliably provide
the requested service. References [7,20,22,24] discuss various
approaches to admission control.
In addition, a service model could offer a "preemptable" flow
service, presumably for a lower cost than non-preemptable service.
When the network was in danger of not meeting some of its
quantitative service commitments, or even if the network was
merely having to deny admission to other flows, then it could
exercise the "preemptability option" on certain flows and
immediately discontinue service to those flows by discarding their
packets (and, presumably, sending a control message informing
those flows of their termination). By terminating service to
these preemptable flows, the service to the flows that are
continuing to receive service will improve, and other non-
preemptable flows can be admitted.
Recall that rate-adaptive flows are able to adjust their
transmission rate. For these flows we can offer an "adjustable"
flow service, again presumably for a lower cost than the regular
non-preemptable, non-adjustable service. When the network was in
danger of not meeting some of its quantitative service
commitments, or even if the network was merely having to deny
admission to other flows, then it could exercise the
"adjustability option" of these flows and request that they reduce
their transmission rate. Similarly, when the network had spare
capacity, it could inform these flows that they could increase
their transmission rate.
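The sketch below shows one way the network might exercise these two
options; the flow fields, the cheapest-first ordering, and the
recovery target are our own assumptions, not part of the service
model itself.

    from dataclasses import dataclass

    @dataclass
    class Flow:
        rate: float             # current transmission rate (b/s)
        min_rate: float         # floor for rate-adaptive flows
        cost: float             # cheaper service is shed first
        adjustable: bool = False
        preemptable: bool = False
        active: bool = True

    def relieve_overload(flows, excess):
        # Recover `excess` b/s when commitments are endangered:
        # first exercise adjustability options, then preempt,
        # cheapest flows first.
        recovered = 0.0
        for f in sorted(flows, key=lambda f: f.cost):
            if recovered >= excess:
                break
            if f.adjustable and f.rate > f.min_rate:
                cut = min(f.rate - f.min_rate, excess - recovered)
                f.rate -= cut           # ask the flow to slow down
                recovered += cut
            elif f.preemptable and f.active:
                f.active = False        # discontinue service
                recovered += f.rate
        return recovered

    flows = [Flow(2e6, 1e6, cost=1.0, adjustable=True),
             Flow(5e6, 5e6, cost=2.0, preemptable=True)]
    print(relieve_overload(flows, 4e6))   # -> 6000000.0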
Admission control can be used to augment the link-sharing service
model described in the previous section. Link-sharing uses packet
scheduling to provide quantitative service commitments about
bandwidth shares. This service is designed to provide sharing
between various entities which have explicitly contracted with the
network to manage that sharing. However, there are other
collective policy issues that do not involve institutional
entities, but rather concern overall utilization levels of the
various service classes (guaranteed, predictive, ASAP). Because
they are not explicitly negotiated, and so no service commitments
are at stake, these utilization levels are not controlled by
packet scheduling but instead are controlled by the admission
control algorithm. All real-time flows are subject to scrutiny by
the admission control process; only those flows that are accepted
can use the network. If the admission control algorithm used the
criterion that a flow is accepted if and only if it can be accepted
without violating other quality of service commitments, then the
utilization levels of the various classes would depend crucially on
the order in which the service requests arrive at the network. One
might desire, instead, to make explicit policy choices about these
various levels of utilization. For instance,
it is probably advisable to prevent starvation of any particular
class of traffic; an explicit control would be needed to prevent
starvation of elastic traffic since the ASAP service does not
involve resource reservation. In addition, one might want the
admissions process to ensure that requests for large amounts of
bandwidth were not always squeezed out by numerous smaller
requests.
To prevent such problems, we must introduce some guidelines,
called "utilization targets", into the admission control algorithm
so that the utilization levels are not just dependent on the
details of the load pattern but instead are guided towards some
preferred usage pattern. This utilization target service model
involves only admission control; thus, it is not properly part of
the core service model. We mention utilization targets here
because other aspects of the core service model rely on these
utilization targets, and also because it is so similar to the
link-sharing model, in that it represents policy objectives for
aggregated classes of traffic.
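As a rough illustration, an admission rule guided by utilization
targets might look as follows (the class names, target values, and
link speed are invented for the example):

    def admit_with_targets(cls, rate, usage, targets, mu):
        # Refuse a request whose class is already at its target
        # utilization, even if raw capacity remains, so that no
        # class (in particular elastic/ASAP traffic) can be starved
        # by the others.
        if usage[cls] + rate > targets[cls] * mu:
            return False
        usage[cls] += rate
        return True

    mu = 45e6                                         # 45 Mbps link
    targets = {"guaranteed": 0.4, "predictive": 0.4}  # 20% kept for ASAP
    usage = {"guaranteed": 0.0, "predictive": 0.0}
    print(admit_with_targets("guaranteed", 17e6, usage, targets, mu))
    # True: 17 Mbps is under the 18 Mbps target
    print(admit_with_targets("guaranteed", 2e6, usage, targets, mu))
    # False: 19 Mbps would overshoot the target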
5.2 Denial to Packets
While denial of service is usually associated with admission
control, it also can be performed on a packet-by-packet
granularity. Denial of service to individual packets could occur
by means of a "preemptable" packet service, whereby flows would
have the option of marking some of their packets as preemptable.
When the network was in danger of not meeting some of its
quantitative service commitments, it could exercise a certain
packet's "preemptability option" and discard the packet (not
merely delay it, since that would introduce out-of-order
problems). By discarding these preemptable packets, the delays of
the not-preempted packets will be reduced.
The basic idea of allowing applications to mark certain packets to
express their "drop preference" and then having the network
discard these packets if the network is congested has been
circulating in the Internet community for years, and has been
simulated in Reference []. The usual problem in such a scheme is
defining what congestion means. In the Internet, with its simple
service model, one usually equates congestion with the presence of
a sizable queue. However, this is a network-centric definition
that is not directly related to the quality of service desired by
the various applications. In contrast, in our setting, we can
make a very precise definition of congestion that is directly tied
to the applications' service requirements: congestion is when some
of the quantitative service commitments are in danger of being
violated. The goal of admission control is to ensure that this
situation arises extremely infrequently.
The basic idea of preemptability can usefully be extended in two
directions. First, for the purposes of invoking the
preemptability options, one can stretch the definition of a
quantitative service commitment to include implicit commitments
such as compliance with the historical record of performance.
That is, one could choose to drop packets to make sure that the
network continued to provide service that was consistent with its
past history, even if that past history was never explicitly
committed to. Furthermore, one could also extend the definition
of a quantitative service commitment to the utilization targets
discussed above.
Second, one can define a class of packets which are not subject to
admission control. In the scenario described above where
preemptable packets are dropped only when quantitative service
commitments are in danger of being violated, the expectation is
that preemptable packets will almost always be delivered and thus
they must be included in the traffic description used in admission
control. However, we can extend preemptability to the extreme
case of "expendable" packets (the term expendable is used to
connote an extreme degree of preemptability), where the
expectation is that many of these expendable packets will not be
delivered. One can then exclude expendable packets from the
traffic description used in admission control; i.e., the packets
are not considered part of the flow from the perspective of
admission control, since there is no commitment that they will be
delivered. Such expendable packets could be dropped not only when
quantitative service commitments are in danger of being violated,
but also when implicit commitments and utilization targets, as
described above, are in danger of being violated.
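A sketch of how a router's enqueue path might honor the two markings
(the packet fields and the two congestion predicates are our own
shorthand for "some quantitative commitment is in danger" and "an
implicit commitment or utilization target is in danger"):

    from dataclasses import dataclass

    @dataclass
    class Packet:
        size: int
        preemptable: bool = False
        expendable: bool = False

    def enqueue(queue, pkt, commitments_endangered, targets_endangered):
        # Expendable packets are mere filler: dropped whenever even
        # an implicit commitment or a utilization target is
        # threatened.
        if pkt.expendable and (commitments_endangered or
                               targets_endangered):
            return False
        # Preemptable packets are dropped, not delayed (which would
        # reorder the flow), when quantitative commitments are in
        # danger of being violated.
        if pkt.preemptable and commitments_endangered:
            return False
        queue.append(pkt)
        return True

    q = []
    print(enqueue(q, Packet(1500, expendable=True), False, True))  # False
    print(enqueue(q, Packet(1500), True, True))                    # True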
The goal of these preemptable and expendable denial of service
options (both at the packet and flow level of granularity) is to
identify and take advantage of those flows that are willing to
suffer some interruption of service (either through the loss of
packets or the termination of the flow) in exchange for a lower
cost. The preemptable flows and packets provide the network with
a margin of error, or a "cushion", for absorbing rare statistical
fluctuations in the load. This will allow the network to operate
at a higher level of utilization without endangering the service
commitments made to those flows who do not choose preemptable
service. Similarly, expendable packets can be seen as "filler"
for the network; they will be serviced only if they do not
interfere with any service commitment but there is no expectation
that their being dropped is a rare event. This will increase the
level of utilization even further. We will not specify further
how these denial of service, or preemptability, options are
defined, but clearly there can be several levels of
preemptability, so that an application's willingness to be
disrupted can be measured on more than a binary scale.
6 Alternative Viewpoints
In this section, we discuss several other viewpoints on the problem
of providing integrated services.
6.1 Scheduling Algorithms vs. Service Models
The motivating principle of this memo is that the service model is
primary. However, one could contend that because we do not yet
know the service needs of future applications, the most important
goal is to design flexible and efficient packet scheduling
implementations. Obviously both packet scheduling implementations
and service models are tremendously important, but the debate here
is over which one should guide the design of the Internet. There
are three points to be made.
First, the service model must be made explicit to application
designers. Currently, there are a rather limited number of
network-intensive applications; the network can, to a large
extent, determine the service requirements of a packet by
inspecting the port number. However, as the variety of network-
intensive applications increases, and as the service requirements
of these applications begin to depend on the user's personal
demands (e.g., high and low priority mail, high and low quality
video from the same codec, etc.), port numbers will no longer be
sufficient to identify service requirements. Rather than having
the network implicitly deliver the appropriate service, the
applications will have to explicitly request the desired service.
For this to happen, the service model must be made explicit (so
that application designers know about it), and it obviously must
remain relatively stable; the service model should not just be
implicitly defined by the packet scheduling implementation. Thus,
regardless of whether the packet scheduling algorithm is flexible
or not, the service model must be made explicit and remain
relatively stable.
Second, there is a fundamental difference in the time-scale over
which packet scheduling implementations and service models have
impact. Once a router vendor with a substantial market presence
adopts a new packet scheduling implementation, it will likely
remain fixed for several years. So, in the short term, we need to
ensure that such packet scheduling implementations embody enough
flexibility to adapt if a new service model is adopted, or the
current service model is extended, during the product's lifetime.
However, router technology, and the embedded packet scheduling
implementations, do evolve as new products are introduced, and so
one cannot expect that packet scheduling implementations will
remain fixed for many years. On the other hand, the time scale of
service models is rather different. It typically takes much
longer for a new service model to become adopted and utilized,
because it must be embedded in user applications. However, once a
service model does become adopted it is much harder to change, for
precisely the same reason. Thus, we can say that while the set of
packet scheduling implementations will likely freeze first, the
service model freezes harder. For this reason we choose to focus
on the service model.
Third, the role of flexibility must be clarified. The services
offered to individual flows by a packet scheduling algorithm must
be part of a service model and, as we argued above, the service
model does not change rapidly (except in experimental networks,
where perhaps using flexible and efficient packet scheduling
implementations is important); in particular, we expect service
models to change much less rapidly than packet scheduling
algorithms. Thus, for quality of service commitments to
individual flows, flexibility is not of great importance.
However, the link-sharing portion of the service model is not
exercised by individual applications but rather by network
managers through some network management interface. This portion
of the service model can change much more rapidly, so flexibility
is indeed important for link-sharing and other forms of resource
sharing. The debate over the relative importance of service
models and packet scheduling implementations reflects, at least in
part, a deeper disagreement over the extent to which quality of
service needs are met indirectly by link-sharing, which controls
the aggregate bandwidth allocated to various collective entities,
as opposed to being met directly by quality of service commitments
to individual flows. Actually, the important distinction here is
not between link-sharing and delay related services, but rather
between those services which require explicit use of the service
interface, and those that are delivered implicitly (i.e., based on
information automatically included in the packet header such as
port numbers). Network architectures designed around such
implicit quality of service mechanisms do not require a well-
defined service model; the network architecture we have advocated
involves explicit quality of service mechanisms and therefore
requires a stable service model.
6.2 Why Use Admission Control?
Real-time service plays a central role in the proposed service
model. We should note that there is another viewpoint on this
issue, which has not yet been adequately articulated in the
Shenker, Clark, Zhang [Page 29]
Internet Draft Integrated Service Architecture October 1993
literature. It is conceivable that the combination of adaptive
applications and sufficient overprovisioning of the network could
render such delay bounds, with the associated need for admission
control, unnecessary; applications could adapt to current network
conditions, and the overprovisioning would ensure that the network
was very rarely overloaded [Note 13]. In this view, it would be
sufficient to provide only the several classes of ASAP service,
without any real-time services. The first question is: can one
indeed overprovision the network so that it is extremely rarely
overloaded? It is true that the statistical demand in the phone
network is well characterized, and overprovisioning has become a
finely honed art. However, there are three crucial differences
between the phone network and the Internet which lead us to the
conclusion that the extreme variability of the offered load will
require too great a degree of overprovisioning to make this
approach practical. First, we do not expect the usage patterns on
the Internet to be nearly so well characterized. In the phone
network the usage patterns tend to revolve around human behavior,
which changes rather slowly. However, in the Internet, the
transfer of a few file repositories can create a dramatic and
immediate shift in traffic patterns. Second, the variability in
usage of an individual phone user is quite limited. In contrast,
computer network usage can easily vary by three orders of
magnitude, from 64 kbps voice to 100 Mbps HDTV. Even if the law of
large numbers does apply, the intrinsic variance of the individual
distributions means that the resulting variance of the aggregate
usage will be several orders of magnitude bigger than in the phone
network. Third, regardless of price, there are natural intrinsic
limits to the maximum bandwidth demand on the phone network: every
person and FAX machine placing a call simultaneously. In
contrast, there is no reason to expect that if bandwidth were
sufficiently cheap there would be limits to the bandwidth
consumption on the Internet (think of having video-on-demand
everywhere). Thus, unless we use excessively high prices to
artificially lower demand, we doubt we can overprovision the
network so that it is extremely rarely overloaded.
_________________________
[Note 13] Of course, this viewpoint is predicated on the nonexistence
of applications which have hard real-time requirements.
Given that overloads will occur if no admission control is used,
the second question is: can applications adequately adapt to these
overloaded conditions, or should we use admission control to
prevent these overloads from occurring? Even if one assumes that
adaptation is done instantaneously (so that there are no transient
periods where the offset delays are incorrectly set), there is the
basic question of whether the user population would be happier all
sharing an overloaded network, or would prefer having some users
turned away. For typical elastic applications such as Telnet, it is
most likely preferable to share the overloaded network. For typical
real-time applications such as remote interactive video, we
conjecture that it is preferable to turn some users away, because the
rapid increase in delays and packet drops as a function of load
causes severe degradation of application performance even for
adaptive applications. In short, the ability to adapt to worse
conditions does not mean that applications perform acceptably under
them.
A common counterargument to our line of reasoning is that users
will be unhappy with any network that denies service with any
significant frequency, and so we are merely trading off the
unhappiness with overloading for the unhappiness caused by denial
of service. While users may expect very low rates of denial for
low-bandwidth applications like voice, there will not likely be
the same expectation for extremely bandwidth intensive
applications like HDTV. We expect that it will be rather easy, and
fairly efficient (i.e., will result in a reasonably high utilization
level), to provision the network so that it accepts almost all phone
calls but occasionally turns away much larger bandwidth requests.
6.3 Variations on the Service Model
There are other approaches to defining real-time service. The
real-time service advocated here provides a bound on the maximum
delay of packets, provided that the application's traffic load
conforms to some prearranged filter. One could provide not only a
bound on the maximum delay but also a nontrivial bound (i.e., a
bound other than the no-queueing bound) on the minimum delay. We
did not include such nontrivial lower bounds on delay in our
present service model because they serve only to reduce buffering
at the receiver and we do not expect buffers to be a bottleneck;
furthermore, if some applications do need additional buffering,
this can easily be supplied at the edge of the network and need
not be built into the basic core service model.
A rather different form of service model is to offer statistical
characterizations of performance. We explicitly reject such
statistically characterized service offerings because they
inherently require a statistical characterization of individual
flows (or at least of the aggregate traffic), and we doubt that
such characterizations will be available. Instead, we rely only
on worst-case characterizations of the flows.
Finally, one can define different link-sharing service models; in
particular, as discussed in [], one can incorporate priorities
between entities into the link-sharing service model (the model
presented here does include priorities in a single entity's
traffic, but not between entities). We do not include this
feature for two reasons. First, a basic principle of this service
model is that the quality of service requirements of individual
applications should be addressed primarily through explicit
service requests. Second, and much more importantly, the priority
features will not produce dramatically different delay behaviors
unless the traffic is very close to the bandwidth limits imposed
by link-sharing.
7 Acknowledgments
We would like to acknowledge that the thoughts discussed in this memo
reflect the contributions of many others. In particular, the works
of Parekh and Gallager [29,18], Ferrari et al. [6,7,19], Jacobson
and Floyd [11], Golestani [15,9], Guérin et al. [20], Kurose et
al. [33], Lazar et al. [22], and Kalmanek et al. [13] have
been critical in shaping our thinking on this matter. Discussions
with our ISIP collaborators, the End-to-End Services Research Group,
the authors of the above works, and many of our other colleagues have
also been instrumental in clarifying our thoughts. In particular,
Abhay Parekh has taught us much about the delay bound results in
[29,18]. Also, Sally Floyd and Van Jacobson have rightly insisted
that packet scheduling algorithms must deal with packet dropping and
hierarchical link-sharing; we wish to acknowledge that much of our
thinking on the hierarchical nature of link-sharing was stimulated
by, and borrows heavily from, their work.
REFERENCES
[1] S. Casner. "private communication", 1992.
[2] D. Clark and V. Jacobson. "Flexible and Efficient Resource
management for Datagram Networks", unpublished draft, 1991.
[3] D. Clark, S. Shenker, and L. Zhang. "Supporting Real-Time
Applications in an Integrated Services Packet Network: Architecture
and Mechanism", in Proceedings of SIGCOMM '92, pp 14-26, 1992.
[4] R. Chipalkatti, J. Kurose, and D. Towsley. "Scheduling Policies for
Real-Time and Non-Real-Time Traffic in a Statistical Multiplexer",
in Proceedings of GlobeCom '89, pp 774-783, 1989.
[5] R. Cocchi, D. Estrin, S. Shenker, and L. Zhang. "A Study of
Priority Pricing in Multiple Service Class Networks", in Proceedings
of SIGCOMM '91, pp 123-130, 1991.
[6] R. Cocchi, D. Estrin, S. Shenker, and L. Zhang. "Pricing in
Computer Networks: Motivation, Formulation, and Example", preprint,
1992.
[7] A. Demers, S. Keshav, and S. Shenker. "Analysis and Simulation of a
Fair Queueing Algorithm", In Journal of Internetworking: Research
and Experience, 1, pp. 3-26, 1990. Also in Proc. ACM SIGCOMM '89,
pp 3-12.
[8] J. DeTreville and D. Sincoskie. "A Distributed Experimental
Communications System", In IEEE JSAC, Vol. 1, No. 6, pp 1070-1075,
December 1983.
[9] D. Ferrari. "Client Requirements for Real-Time Communication
Services", In IEEE Communications Magazine, 28(11), November 1990.
[10] D. Ferrari. "Distributed Delay Jitter Control in Packet-Switching
Internetworks", In Journal of Internetworking: Research and
Experience, 4, pp. 1-20, 1993.
[11] D. Ferrari, A. Banerjea, and H. Zhang. "Network Support for
Multimedia", preprint, 1992.
[12] D. Ferrari and D. Verma. "A Scheme for Real-Time Channel
Establishment in Wide-Area Networks", In IEEE JSAC, Vol. 8, No. 3,
pp 368-379, April 1990.
[13] S. Floyd. "Link-sharing and Resource Management Models for Packet
Networks", preprint, 1993.
[14] S. J. Golestani. "A Stop and Go Queueing Framework for Congestion
Management", In Proceedings of SIGCOMM '90, pp 8-18, 1990.
[15] S. J. Golestani. "Duration-Limited Statistical Multiplexing of
Delay Sensitive Traffic in Packet Networks", In Proceedings of
INFOCOM '91, 1991.
[16] R. Guérin and L. Gün. "A Unified Approach to Bandwidth
Allocation and Access Control in Fast Packet-Switched Networks", In
Proceedings of INFOCOM '92.
[17] R. Guérin, H. Ahmadi, and M. Naghshineh. "Equivalent Capacity and
Its Application to Bandwidth Allocation in High-Speed Networks", In
IEEE JSAC, Vol. 9, No. 9, pp 968-981, September 1991.
[18] J. Kurose. "Open Issues and Challenges in Providing Quality of
Service Guarantees in High-Speed Networks", In Computer
Communication Review, 23(1), pp 6-15, 1993.
[19] J. Hyman and A. Lazar. "MARS: The Magnet II Real-Time Scheduling
Algorithm", In Proceedings of SIGCOMM '91, pp 285-293, 1991.
[20] J. Hyman, A. Lazar, and G. Pacifici. "Real-Time Scheduling with
Quality of Service Constraints", In IEEE JSAC, Vol. 9, No. 9, pp
1052-1063, September 1991.
[21] J. Hyman, A. Lazar, and G. Pacifici. "Joint Scheduling and
Admission Control for ATS-based Switching Nodes", In Proceedings of
SIGCOMM '92, 1992.
[22] J. Hyman, A. Lazar, and G. Pacifici. "A Separation Principle
Between Scheduling and Admission Control for Broadband Switching",
In IEEE JSAC, Vol. 11, No. 4, pp 605-616, May 1993.
[23] V. Jacobson and S. Floyd. "private communication", 1991.
[24] V. Jacobson. "private communication", 1991.
[25] S. Jamin, S. Shenker, L. Zhang, and D. Clark. "An Admission
Control Algorithm for Predictive Real-Time Service", In Proceedings
of the Third International Workshop on Networking and Operating
System Support for Digital Audio and Video, 1992.
[26] C. Kalmanek, H. Kanakia, and S. Keshav. "Rate Controlled Servers
for Very High-Speed Networks", In Proceedings of GlobeCom '90, pp
300.3.1-300.3.9, 1990.
[27] R. Nagarajan and J. Kurose. "On Defining, Computing, and
Guaranteeing Quality-of-Service in High-Speed Networks", In
Proceedings of INFOCOM '92, 1992.
[28] A. Parekh and R. Gallager. "A Generalized Processor Sharing
Approach to Flow Control - The Single Node Case", In Technical Report
LIDS-TR-2040, Laboratory for Information and Decision Systems,
Massachusetts Institute of Technology, 1991.
[29] A. Parekh. "A Generalized Processor Sharing Approach to Flow
Control in Integrated Services Networks", In Technical Report LIDS-
TR-2089, Laboratory for Information and Decision Systems,
Massachusetts Institute of Technology, 1992.
[30] C. Partridge. "A Proposed Flow Specification", RFC-1363, July 1992.
[31] H. Schulzrinne. "private communication", 1992.
[32] H. Schulzrinne, J. Kurose, and D. Towsley. "Congestion Control for
Real-Time Traffic", In Proceedings of INFOCOM '90.
[33] S. Shenker. "Service Models and Pricing Policies for an Integrated
Services Internet", to appear in Proceedings of "Public Access to
the Internet", Harvard University, 1993.
[34] S. Shenker, D. Clark, and L. Zhang. "A Scheduling Service Model
and a Scheduling Architecture for an Integrated Services Packet
Network" preprint, 1993.
[35] D. Verma, H. Zhang, and D. Ferrari. "Delay Jitter Control for
Real-Time Communication in a Packet Switching Network", In
Proceedings of TriCom '91, pp 35-43, 1991.
[36] C. Weinstein and J. Forgie. "Experience with Speech Communication
in Packet Networks", In IEEE JSAC, Vol. 1, No. 6, pp 963-980,
December 1983.
[37] D. Yates, J. Kurose, D. Towsley, and M. Hluchyj. "On Per-Session
End-to-End Delay Distribution and the Call Admission Problem for
Real Time Applications with QOS Requirements", In Proceedings of
SIGCOMM '93, to appear.
[38] L. Zhang, S. Deering, D. Estrin, S. Shenker, and D. Zappala, "RSVP:
A New Resource ReSerVation Protocol", Accepted for publication in
IEEE Network, 1993.
A. On Standardizing a Service Model
Let us assume, for the sake of argument, that the Internet community
agrees to adopt a service model similar in spirit to the one proposed
here. There is then the question of how one "standardizes" the service
model. There are two approaches. First, one could identify a single
packet forwarding algorithm which supports this service model and then
require that all routers use this algorithm. This entails standardizing
the detailed packet scheduling mechanisms in the routers. It is not
clear that all router technologies will be able to implement this
particular packet scheduling mechanism, and so this approach may limit
the range of technologies that can be employed in the Internet. One
expects (in fact, for the sake of progress, one hopes) that the future
Internet will have a diverse set of underlying technologies, so
requiring uniformity of the packet forwarding algorithm is probably
neither realistic nor desirable. The second approach involves adopting the
service model without specifying the underlying mechanism. This path,
while not nearly as straightforward (in fact, it poses a special
challenge to the Internet's standardization procedures), is far more
flexible. In this second approach there are several different
conceptual issues which must be dealt with: (1) what services will be
offered, (2) how are those services requested by the application, and
(3) how are those services provided by the network. In this section we
briefly address these three issues.
A.1 Defining the Services
There are two separate components to defining the set of services
offered by the network: the service model and the service
specification.
Service Model
This is the basic set of services offered by the network
and, as such, is the central expression of the network
architecture. As we have argued previously, the service
model should be based on fundamental application
requirements. We have proposed a core service model in this
memo. For individual flows it provides for two kinds of
real-time service, guaranteed service and predictive
service, along with multiple levels of elastic service.
The service model also provides for hierarchical link-
sharing services between collective entities.
Service Specification
This is the detailed parameterization of the service model.
This specification details how the service delivered to
flows is characterized (e.g., delay bounds, etc.). In
addition, the service specification details both how flows
characterize themselves to the network (e.g., token bucket,
leaky bucket, peak rate, etc.), and how these
characterizations are enforced by the network (e.g., by
dropping or delaying packets, at entry points or at every
router, etc.). While the service model is derived from
rather general principles, the service specification
involves making a number of detailed (and perhaps
arbitrary) engineering choices.
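As one example of the first kind of choice, a token bucket
characterization and its conformance test can be stated in a few
lines. This is the standard token bucket filter, shown only to fix
ideas; we are not proposing a particular parameterization here.

    class TokenBucket:
        # Tokens accumulate at rate r (bits/sec) up to depth b; a
        # packet of `size` bits conforms if that many tokens are
        # available when it arrives.
        def __init__(self, r, b):
            self.r, self.b = r, b
            self.tokens, self.last = b, 0.0

        def conforms(self, now, size):
            # Refill, then spend.  What to do with a nonconforming
            # packet (drop it, delay it, reshape the flow) is
            # exactly the sort of choice the service specification
            # must pin down.
            self.tokens = min(self.b,
                              self.tokens + self.r * (now - self.last))
            self.last = now
            if size <= self.tokens:
                self.tokens -= size
                return True
            return False

    tb = TokenBucket(r=1e6, b=8e3)   # 1 Mb/s rate, 8 kbit depth
    print(tb.conforms(0.0, 8e3))     # True: the bucket starts full
    print(tb.conforms(0.001, 8e3))   # False: only ~1 kbit refilled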
A.2 Obtaining the Services
There are three kinds of services: link-sharing, elastic, and real-
time. The link-sharing services will presumably be controlled
through a network management interface. Since this network
management interface will not typically be invoked by widely-used
applications, there are few compatibility constraints. Thus, this
interface can gradually evolve and so we need not be concerned with
making its definition precise now. Since providing elastic service
requires no preallocation of resources, we presume that applications
utilizing elastic service will not need to pass through admission
control. These elastic service desires (i.e., which level of ASAP
service, and the preemptability of the packets) will probably be
specified by the application in the interface to the transport
protocol, and this will in turn be communicated to the network
through some indication in the individual packet headers. We assume
that this will be addressed in some other standardization venue, and
we will not address it further here.
In contrast, providing the real-time services does require
preallocation of network resources. Applications desiring real-time
service will have to first explicitly request that service, and this
request involves reserving network resources. The reservation
procedure has two steps; first the application will invoke some
operating system interface to request the reservation, and then some
"set-up" or "reservation" protocol will forward that request to the
network and return an answer. The application logically sees a
request-response semantics through some operating system interface;
the routers interact not with the application but with the
reservation protocol and that can have a different interface. The
set-up protocol has its own "service model" of the various
configurations of reserved state it can provide; these were termed
"reservation styles" in [10] (we are not advocating the particular
reservation styles in [10] but are merely citing them as examples of
nontrivial relationships between flows and reserved resources). As
an example of this we note that so far in our exploration of the
service model, we have discussed only the service given to actual
flows (that is, flows whose packets are forwarded by the network).
However, one can reserve resources for potential flows as well;
potential flows are those whose packets are not currently being
forwarded, but for whom resources have been reserved.
Thus, in defining how applications obtain real-time services, we must
deal with the reservation model of the set-up protocol and not just
the service model of the network itself. We also must describe what
information is passed between the applications and the network, and
then must provide a detailed description of the interface invoked by
applications. More specifically, the three conceptual pieces are:
Reservation Model
The reservation model describes what configurations of
resources can be reserved. The reservation model must not
only address the service model available to individual
flows, but must also incorporate more general
configurations of reserved resources.
Negotiation Model
The negotiation model describes, at an architectural level,
(1) what parameters the application hands to the operating
system interface, and (2) what parameters the application
gets back from that interface. The negotiation model will
depend on, and may therefore dictate, which end (source,
receiver) of the application is submitting the requests.
The negotiation model will also have implications for the
set of queries and responses implemented in the admission
control algorithm.
Reservation Interface
This is a detailed parameterization (essentially the API)
of the negotiation model. This reservation interface will
be the artifact that is invoked by applications. It should
be designed to be properly extensible, so that new services
can be added, but it will inevitably be subject to
compatibility constraints and therefore the previously
defined components will be largely immutable.
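To fix ideas, here is a purely hypothetical sketch of the parameters
that might cross such an interface. None of these names or fields is
being proposed; a real reservation interface would carry considerably
more.

    from dataclasses import dataclass

    @dataclass
    class ReserveRequest:         # handed down by the application
        service: str              # "guaranteed" or "predictive"
        token_rate: float         # worst-case traffic
        bucket_depth: float       #   characterization
        target_delay: float       # requested delay bound (seconds)

    @dataclass
    class ReserveReply:           # handed back to the application
        accepted: bool
        delay_bound: float        # bound the network will commit to
        handle: int               # names the reservation for later
                                  # adjustment or teardown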
A.3 Providing the Services
The previous two sections specify the range of services available,
and how an application can obtain them. If there were a single
network provider, then we would just require that the network support
the interface to the set-up protocol and deliver the desired service.
However, the Internet is, and will likely continue to be, a very
heterogeneous and administratively decentralized institution. The
service delivered to a flow is a concatenation of the service
provided by the various routers along its path, and these routers
need not implement the same packet forwarding algorithms. Thus, we
need to directly address how we can be assured that such a network,
with local operators making decisions about which routers to install,
will indeed support the service model. As mentioned previously, one
approach is to insist on a single router mechanism. The approach we
advocate is, instead, to provide a functional requirement on routers
rather than a definition of the detailed mechanism.
Router Interoperability Requirements
This specifies a set of criteria that a router has to
conform to. There are two categories of criteria. First,
the routers must implement the interface used by the set-up
protocol, and the admission control algorithm must support
the appropriate set of queries. Incorporated within this
is something functionally equivalent to what is described
in RFC 1363 [3], which describes the set of parameters
handed to routers along the path of a flow. Second, the
service delivered by the router must conform to certain
standards; these standards are designed so that the service
delivered by a heterogeneous set of conforming routers will
obey the service model. For guaranteed service, one might
require that routers approximate the WFQ "fluid
model", as defined in []. One can express the accuracy to
which a router supports the fluid model with an error term
which can be carried in the reservation interface as it is
passed between routers and added up to compute the
resulting end-to-end delay bounds. We should note that
this is just one example; there are other alternatives for
specifying how to deliver guaranteed service. For
predictive service, the issue is much more difficult. Here
the performance depends crucially on the admission control
algorithm [Note 14], and it is difficult to accurately
characterize a measurement-based admission control
algorithm. We propose that the performance of such
algorithms be characterized by their performance on various
test suites. These test suites will reveal not only the
degree to which the delay bounds are met, but also the
level of network utilization obtained (which depends on the
behavior of the admission control algorithm). How one
defines and evaluates such test suites is an unexplored yet
crucial issue.
_________________________
[Note 14] The guaranteed service depends on admission control as well,
but for guaranteed service there is a clear correctness condition for
admission control. There is no such clear correctness condition for
predictive admission control.
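As an illustration of how such error terms might compose (the b/R
fluid-model term and the additive per-hop error below are assumptions
in the spirit of the text, not a specified formula):

    def end_to_end_bound(bucket_depth, hops):
        # Each hop advertises (reserved_rate, error_term): the
        # fluid-model delay is the token bucket depth divided by
        # the bottleneck rate, and each hop's deviation from the
        # ideal fluid model is added on.
        fluid = bucket_depth / min(rate for rate, _ in hops)
        slack = sum(err for _, err in hops)
        return fluid + slack

    # A 10 kbit burst through three hops, each reserving 1 Mb/s and
    # advertising a 2 ms error term:
    print(end_to_end_bound(1e4, [(1e6, 0.002)] * 3))
    # -> 0.016 (10 ms fluid delay plus 6 ms accumulated error)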